Diffraction anisotropy and paired refinement: crystal structure of H33, a protein binder to interleukin 10

The paired refinement procedure was applied to diffraction data from binder H33 corrected for strong diffraction anisotropy.


Introduction
The diffraction quality of a crystal is usually different in various reciprocal space directions. Diffraction anisotropy can be caused by crystal growth, the crystal shape, the modulated volume of the irradiated crystal during the measurement and the arrangement of molecules inside the crystal. This phenomenon is often not a serious issue for a successful structure determination. Most of the macromolecular refinement programs are able to work with weak diffraction anisotropy. But severe diffraction anisotropy may represent a serious threat. The difficulties may appear in the process of phasing and/or structure refinement. However, several computational tools have been developed to analyse or even account for diffraction anisotropy, e.g. AIMLESS (Evans & Murshudov, 2013), STARANISO (Tickle et al., 2018) and Diffraction Anisotropy Server (Strong et al., 2006). These methods perform anisotropic cut-off of the data together with rescaling of intensities or structure factors with scales depending on the analysis and model of anisotropy employed by each program. These modifications are beneficial for a large number of crystal structures and are reported in Section 2 (Rupp, 2018).
Paired refinement is a modern method for determining the high-resolution cut-off of diffraction data (Karplus & Diederichs, 2012). For this method, the reference data are selected on the basis of a conservative cut-off [e.g. ⟨I/σ(I)⟩ < 2]. More and more reflections are added to the model refinement in a stepwise manner, and their positive or negative contribution is evaluated on a number of criteria, mainly Rfree calculated on the reference data. Recently, the method has been implemented in the program PAIREF (Malý et al., 2020, 2021). However, the current protocol implemented in PAIREF does not consider the anisotropic diffraction qualities of the crystals. Both the reference data and the evaluated reflections are in the form of spherical shells. We investigated the possibility of combining corrections for diffraction anisotropy with the standard paired refinement approach. The crystals of our target protein H33 showed serious anisotropy in the diffraction qualities. H33 is an artificial protein binder that was selected during a directed evolutionary study (Pham et al., 2021); it is a variant of the protein scaffold derived from the N-terminal domain of the PIH1D1 domain of the R2TP cochaperone complex (PDB entry 4psf; Hořejší et al., 2014). The scaffold was trained using the ribosome display technique to bind human interleukin-10 (IL-10), a cytokine of human innate immunity (El Kasmi et al., 2007). Blocking or potentiating IL-10 signalling by artificially evolved non-antibody binders such as H33 could be an important component of the treatment of inflammatory, malignant and autoimmune diseases in which IL-10 plays a role. However, our understanding of the structural aspects of the binding between the binders and IL-10 is quite limited. So far, we have solved the structure of only one IL-10 binder, called J61 (PDB entry 7avc; Pham et al., 2021). Therefore, a newly solved structure of H33 will aid in the design of new potent and selective binders.
In our work, we introduced an approach to perform paired refinement using the anisotropic data. Anisotropic scaling proved to have a positive impact on the quality of the observed electron density.

Protein production and crystallization
Protein production, purification and basic characterization have been described previously (Pham et al., 2021). Briefly, the synthesized DNA strings were cloned into the pET-26b(+) vector. The plasmid was used for transformation into the Escherichia coli strain BL21(DE3). The bacteria were grown in LB medium, and protein expression was induced by the addition of isopropyl-beta-d-thiogalactopyranoside. After cell disruption, the soluble fraction was separated by centrifugation and the protein was purified from the cell lysate by affinity chromatography using Strep-Tactin XT resin. The last purification step was performed using size-exclusion chromatography (Superdex 75 16/600 column).
The crystals were prepared using the hanging-drop vapour-diffusion method from a protein solution that contained 20 mM Tris, 100 mM NaCl pH 8.0 and the protein at a concentration of 10 mg ml⁻¹. The protein crystallized in a wide range of crystallization conditions. However, the crystals diffracted poorly. The final crystallization conditions were 1 M (NH4)2SO4, 1%(w/v) PEG 3350, 0.1 M bis-Tris pH 5.5. Cryoprotection with 20%(v/v) glycerol was necessary before flash-freezing in liquid nitrogen.

Diffraction data collection and processing
The synchrotron data were collected on beamline P13 (Cianci et al., 2017) operated by EMBL Hamburg at the PETRA III storage ring (DESY, Hamburg, Germany). The diffraction images were processed with XDS (Kabsch, 2010) up to 2.3 Å resolution. The data quality metrics [decrease in ⟨I/σ(I)⟩, decrease in CC1/2] indicate radiation damage that started immediately after 180° of total oscillation and progressed to the end of the measurement. Therefore, only half of the images (3600) were used for further data evaluation. Such data treatment should remove the possible impact of absorbed dose on the resulting diffraction anisotropy. Initial scaling of the data was performed using AIMLESS (Evans & Murshudov, 2013). The data were severely anisotropic according to a number of indicators. For example, the estimates of the diffraction limits reported by AIMLESS [based on the criterion ⟨I/σ(I)⟩ > 1.5 in the highest-resolution shell] were 3.28 and 2.65 Å along the hk plane and the l axis, respectively.
Due to severe diffraction anisotropy, the unmerged scaled data from XDS (XDS_ASCII.HKL file) were merged and corrected for anisotropy using the STARANISO server (Tickle et al., 2018) with four different local spherical ⟨Imean/σ(Imean)⟩ cut-offs going down from 1.2 (STARANISO default value, A1.2 data) to 1.0 (A1.0 data), 0.75 (A0.75 data) and the lowest available value 0.5 (A0.5 data). The free flags were generated with the program FREERFLAG (Brünger, 1992). Initially, free flags were generated for the A1.2 data. The free flags for the A1.0 data were generated with the option to copy already existing flags for reflections in the A1.2 data. A similar approach was used for the generation of free flags for the A0.75 and A0.5 data. This approach was necessary to maintain the pairwise consistency of the free flags within the different data. Data quality indicators are shown in Table 1.
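The inheritance of free flags between nested data sets can be illustrated with a minimal Python sketch. It is not the FREERFLAG implementation: the function name `extend_free_flags`, the dictionary representation of the flags, the 5% free fraction and the toy reflection lists are all assumptions made for illustration.

```python
import random

def extend_free_flags(parent_flags, reflections, free_fraction=0.05, seed=0):
    """Assign free flags to `reflections`, reusing every flag in `parent_flags`.

    parent_flags: dict mapping (h, k, l) -> True (free set) or False (work set)
    reflections:  iterable of (h, k, l) present in the larger data set
    free_fraction and seed are illustrative assumptions, not values from the study.
    """
    rng = random.Random(seed)
    flags = {}
    for hkl in reflections:
        if hkl in parent_flags:
            # Reflection already flagged in the smaller data set: copy unchanged.
            flags[hkl] = parent_flags[hkl]
        else:
            # Newly added (weaker) reflection: draw a fresh flag.
            flags[hkl] = rng.random() < free_fraction
    return flags

# Hypothetical toy data: two reflections in "A1.2", three in "A1.0".
a12_flags = {(1, 0, 0): True, (0, 1, 0): False}
a10_flags = extend_free_flags(a12_flags, [(1, 0, 0), (0, 1, 0), (0, 0, 1)])
```

Because shared reflections keep their flags, a reflection that was in the free set of the smaller data set never enters the work set of a larger one, which is what makes the Rfree comparison between steps meaningful.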
The phase problem was solved by molecular replacement using PHASER (McCoy et al., 2007) employing the J61 variant of the protein binder from the same directed evolutionary study (PDB entry 7avc; Pham et al., 2021) as a search model. Data with the highest ⟨Imean/σ(Imean)⟩ cut-off (A1.2) were used. Two molecules were found in the asymmetric unit. Due to the low resolution of the data and the unstable refinement (unacceptable number of Ramachandran outliers and bad bond angles) in REFMAC5, the structure was restrained to the original scaffold of the PDB entry 4psf refined at 1.58 Å (Hořejší et al., 2014) with PROSMART (Nicholls et al., 2012). The structure was refined with isotropic atomic displacement parameters (ADPs) and no TLS domains defined. Manual corrections to the model were performed with Coot (Emsley et al., 2010).
For the manually launched paired refinement, the data with the highest cut-off (A1.2) were initially chosen. Refinement of the structure model restrained to the structure of PDB entry 4psf with REFMAC5 was used. To keep the same refinement scheme as used previously, three cycles were performed in each paired refinement step. We also performed several manual paired refinements with ten cycles of refinement. Although the results differ in exact values, this change did not lead to a different decision on data usage. Several criteria were evaluated during the paired refinement. Mainly, drops in overall Rwork and Rfree were monitored. In addition to that, Rfree in the highest-resolution shell did not exceed the value of 0.42 (the theoretically perfect model gives an R value of 0.42 against random data with no twinning and no translational non-crystallographic symmetry; Evans & Murshudov, 2013), and the values of CCwork and CCfree did not exceed the value of CC* (see Table 1). The main results of the paired refinement are shown in Table 2. The decrease in Rwork and Rfree values in all three steps indicates that the addition of the progressively weaker reflections improved the model quality against the same (stronger) data. Therefore, A0.5 data were used in further structure refinements. The exact values of the final Rwork and Rfree in the fifth column of Table 2 cannot be directly compared because they were calculated against different data. Although the differences in the Rwork and Rfree values can be considered marginal, they are comparable to values published in previous studies (Karplus & Diederichs, 2012; Malý et al., 2020, 2021). The final model refinement using ten cycles was carried out using all reflections (work and free) of the A0.5 dataset. The jelly-body protocol was used to release the previously used and necessary restraints.
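The two supplementary checks on the highest-resolution shell can be written out as a short Python sketch. CC* is computed from CC1/2 with the relation given by Karplus & Diederichs (2012); the function names and the example numbers are hypothetical and are not values from Table 1.

```python
import math

def cc_star(cc_half):
    """CC* = sqrt(2*CC1/2 / (1 + CC1/2)); assumes 0 <= cc_half <= 1."""
    return math.sqrt(2.0 * cc_half / (1.0 + cc_half))

def shell_is_acceptable(r_free_shell, cc_work, cc_free, cc_half, r_limit=0.42):
    """Check the shell criteria described in the text.

    r_limit = 0.42 is the R value of a perfect model against random data
    (no twinning, no translational NCS), as quoted in the text.
    """
    limit = cc_star(cc_half)
    return (r_free_shell <= r_limit
            and cc_work <= limit
            and cc_free <= limit)

# Hypothetical shell statistics, for illustration only.
example_ok = shell_is_acceptable(r_free_shell=0.40, cc_work=0.60,
                                 cc_free=0.55, cc_half=0.50)
example_bad = shell_is_acceptable(r_free_shell=0.45, cc_work=0.60,
                                  cc_free=0.55, cc_half=0.50)
```

A CCwork or CCfree above CC* would suggest that the model agrees with the measured data better than the data agree with the underlying true signal, which is a sign of overfitting.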
The quality of the structure stereochemistry was checked using the validation tools in Coot (Emsley et al., 2010), CCP4 (Agirre et al., 2023; Winn et al., 2011), MOLPROBITY (Chen et al., 2010) and the Protein Data Bank (Berman et al., 2003). The quality indicators of the final structure refinement are shown in Table 3. Raw diffraction data are available from https://doi.org/10.5281/zenodo.4033811. The structure coordinates were deposited under PDB entry 8bdu.
For analysis of the additional value of anisotropic scaling along with paired refinement in terms of data quality and observed electron density, data processed in the standard (isotropic) way (Iso data) were used in paired refinement with a 2.9 Å starting resolution. The complete cross-validation procedure implemented in PAIREF extended the resolution to 2.8 Å.
Table 2. Progress of paired refinement using data with a continuously decreasing ⟨I/σ(I)⟩ cut-off.
Initial R values in X→Y steps are calculated using the model refined with data X against data X. Final R values are calculated using the model refined with data Y against data X.
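The comparison of initial and final R values within one step can be sketched in Python as follows. The class name `PairedStep`, the decision rule (accept the wider data when Rfree against the common data X decreases) and all numbers are illustrative assumptions; they do not reproduce the values of Table 2 or the logic of PAIREF.

```python
from dataclasses import dataclass

@dataclass
class PairedStep:
    """One step from data X to data Y; every R value is calculated against data X."""
    label: str
    r_work_initial: float  # model refined with X, evaluated against X
    r_free_initial: float
    r_work_final: float    # model refined with Y, evaluated against X
    r_free_final: float

    def delta_r_work(self) -> float:
        return self.r_work_final - self.r_work_initial

    def delta_r_free(self) -> float:
        return self.r_free_final - self.r_free_initial

    def supports_extension(self) -> bool:
        # Assumed rule: a drop in Rfree on the common data favours adding
        # the weaker reflections of data Y.
        return self.delta_r_free() < 0.0

# Hypothetical numbers, for illustration only.
step = PairedStep("A1.2->A1.0", 0.2300, 0.2700, 0.2295, 0.2690)
```

Evaluating both models against the same data X is the essential point: R values calculated against different data sets are not directly comparable.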

Results and discussion
The artificially generated binder H33 was successfully crystallized and the diffraction data were collected. The crystal diffracted anisotropically, and correction of the intensities for diffraction anisotropy was performed. The crystal structure was solved and refined using the A1.2 data. The high-resolution diffraction limit was extended using the paired refinement procedure to that of the A0.5 data. The A0.5 data were used for final structure refinement.
The structure of binder H33 is highly similar to that of binder J61 (Pham et al., 2021) from the same study. The root mean square deviation calculated on 128 Cα atoms is lower than 1.2 Å. The structure has an unusually high solvent content of 76% (Kantardjieff & Rupp, 2003). This high solvent content is present in <2% of the crystal structures in the PDB. The solvent content is probably responsible for the low diffraction quality of the crystals.
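To put the solvent content into the more familiar terms of the Matthews coefficient VM, the widely used approximation Vs = 1 − 1.23/VM (with VM in Å³ Da⁻¹) can be inverted. The relation and its constant are the standard approximation for proteins and are an assumption here; the text does not state how the 76% was derived. The function names are hypothetical.

```python
def solvent_fraction(v_m, protein_term=1.23):
    """Approximate solvent fraction from the Matthews coefficient (A^3/Da)."""
    return 1.0 - protein_term / v_m

def matthews_for_solvent(fraction, protein_term=1.23):
    """Invert the approximation: Matthews coefficient for a given solvent fraction."""
    return protein_term / (1.0 - fraction)

# Solvent content reported in the text.
v_m_h33 = matthews_for_solvent(0.76)
```

Under this approximation, a solvent content of 76% corresponds to a VM of about 5.1 Å³ Da⁻¹.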
Previous studies have shown that diffraction anisotropy is not strictly dependent on crystal packing (Robert et al., 2017). The molecules in the crystal of binder H33 are arranged in tubules perpendicular to the z axis of the crystal lattice. The tubules have large channels of solvent between them. The planes with the normal vector perpendicular to the z axis are the least occupied with molecules [see Fig. 1(b)]. In contrast, no large channels are present in the planes with normal vectors perpendicular to the x or y axis.
Using the data range according to paired refinement may result in an improvement of the observed electron density (Karplus & Diederichs, 2015; Malý et al., 2020). Correction for the anisotropy in the diffraction qualities was also shown to improve the observed electron density (Tickle et al., 2018). Paired refinement using shells reflecting the diffraction anisotropy is not automated in any pipeline. The available software, for example PAIREF (Malý et al., 2020) and PDB-REDO (Joosten et al., 2014), adds reflections in spherical shells by increasing the spherical high-resolution diffraction limit. Here, we propose the addition of reflections with the same expected information content in non-spherical shells.
[Figure 1 caption, beginning missing] ... .15 e Å⁻³ calculated for the model before the paired refinement procedure using data A1.2, the output model from the paired refinement procedure using the A0.5 data and the model from the isotropic paired refinement procedure using the Iso data, respectively. (g)-(i) Residue Leu75 (chain A) with the 2mFo − DFc electron density at the 1σ level for the same combination of model versus data as in the previous triplicate. Graphics were generated with CCP4MG (McNicholas et al., 2011).
The quality of the electron density depends on the diffraction data and the structural model. In our analysis, we compared electron density maps of (i) the starting model for the paired refinement refined using the A1.2 data against the A1.2 data, (ii) the resulting model from the paired refinement refined using the A0.5 data against the A0.5 data and (iii) the optimal model from the paired refinement using the Iso data at 2.8 Å resolution. The electron density maps were calculated with fast Fourier transformation using the same grid spacing to avoid possible bias (Urzhumtsev et al., 2014). The electron density map from the Iso data is the least detailed [see Fig. 1(f)]. Corrections for diffraction anisotropy using the STARANISO server (Tickle et al., 2018) and using the A1.2 data dramatically improved the quality of the observed electron density maps. No differences were observed with the extension of data from A1.2 to A0.5.
The number of reflections in the A0.5 data is approximately equal to that of the Iso data. However, their information content is evidently different. The anisotropic cut-off in the A0.5 data removed a significant portion of noisy reflections. The Iso dataset contains reflections in the weak directions with a low signal-to-noise ratio. Moreover, it does not contain a portion of the strong reflections in the strong directions that are present in the A0.5 data at a resolution higher than 2.8 Å.
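The difference between a spherical and an anisotropic selection of reflections can be illustrated with a deliberately simplified Python sketch. It uses a fixed ellipsoid whose semi-axes are the directional limits reported by AIMLESS (3.28 Å in the hk plane, 2.65 Å along l), whereas STARANISO applies a cut-off on the local ⟨Imean/σ(Imean)⟩; the ellipsoid, the orthogonal 100 Å unit cell and the function names are all assumptions made for illustration.

```python
def d_spacing(hkl, cell):
    """d-spacing (A) in an orthogonal cell (a, b, c); not valid for hkl = (0, 0, 0)."""
    h, k, l = hkl
    a, b, c = cell
    inv_d_sq = (h / a) ** 2 + (k / b) ** 2 + (l / c) ** 2
    return inv_d_sq ** -0.5

def in_sphere(hkl, cell, d_min):
    """Isotropic cut-off: keep the reflection if its d-spacing is at least d_min."""
    return d_spacing(hkl, cell) >= d_min

def in_ellipsoid(hkl, cell, limits):
    """Ellipsoidal cut-off with resolution limits (A) along a*, b* and c*."""
    h, k, l = hkl
    a, b, c = cell
    d_a, d_b, d_c = limits
    return (h * d_a / a) ** 2 + (k * d_b / b) ** 2 + (l * d_c / c) ** 2 <= 1.0

cell = (100.0, 100.0, 100.0)   # hypothetical orthogonal cell
limits = (3.28, 3.28, 2.65)    # directional limits quoted in the text
strong_dir = (0, 0, 37)        # d = 2.70 A along l (strong direction)
weak_dir = (33, 0, 0)          # d = 3.03 A along h (weak direction)
```

With a spherical limit of 2.8 Å, the reflection along l is discarded and the one along h is kept, whereas the ellipsoid does the opposite. This is the trade-off described above: a similar number of reflections, but a different selection.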
In our case, both approaches to data optimization (correction for diffraction anisotropy and paired refinement) have proved useful. Although no improvement in the observed electron density occurred after paired refinement of the data corrected for diffraction anisotropy, 2676 unique reflections (14.5% of 18 374 reflections in total) were added to the refinement scheme using the A0.5 data. This addition was validated by the decrease in R values (see Table 2).
The current trend in data quality evaluation (paired refinement) is to investigate the 'additional value' of more and more observations involved in model refinement. Conventional indicators of the quality of the diffraction data are no longer relevant for the estimation of the high-resolution diffraction limit. The diffraction anisotropy makes the problem even more difficult. Our crystal structure was determined at a nominal diffraction limit of 2.47 Å. However, closer inspection of the diffraction data statistics shows that the diffraction data become dramatically incomplete at better than 3.13 Å resolution. The highest-resolution shell of reflections has a spherical completeness lower than 15%. This indicator must be considered when comparing structures refined 'with the same diffraction limits'.